Stab-SGD: Noise-Adaptivity in Smooth Optimization with Stability Ratios
In the context of smooth stochastic optimization with first-order methods, we introduce the stability ratio of gradient estimates as a measure of the local relative noise level, ranging from zero for pure noise to one for negligible noise. We show that a schedule-free variant of stochastic gradient descent (Stab-SGD), obtained by simply shrinking the learning rate by the stability ratio, achieves real adaptivity to noise levels (i.e.